>> AMOS Code Optimisation Tips <<
  -----------------------------

Aminet - dev/amos/Optim_Tips.lha

Compiled by:   Ben Wyatt           bwyatt@paston.co.uk

Contributors:  Garfield Benjamin   gbenjam@ix.netcom.com
               Ben Wyatt           bwyatt@paston.co.uk
               Paul Hickman        ph@doc.ic.ac.uk
               Tim Lewis           T.Lewis@bton.ac.uk
               Petri Hakkinen      mystic@tlti.tokem.fi
               Mark de Jong        JP.de.Jong@inter.nl.net
               Paul Heald          u5o83@cc.keele.ac.uk

Here are a few optimisation tips for those of you trying to squeeze that
little bit more speed out of code. With the following examples the speed
increase may not be noticeable on its own, but each one will make your code
that little bit faster.

Important: Do not go through your entire AMOS program collection, converting
all your code so it's slightly faster. Only alter code which you think is
/too/ slow. For example, don't change your 3D renderer just to cut the
rendering time by 1/50 of a second or something silly. And similarly, don't
bother fiddling with the main loop of a game when it already runs in a Vbl!
A lot of these optimisation tips produce messy code, so often it isn't worth
changing things!

Note: Some of these speed increases only work when compiled, mainly with the
AMOSPro Compiler. A lot of them have only been tested this way. It must also
be stated that these tips may be incorrect for some CPUs - they have mainly
been tested on a 68020, but caches and different instruction cycle timings
could alter the results. Some sort of chart may be available in future
releases of this document - the best idea is to experiment on your system.

Each tip is marked by a *, followed by the amount of speed increase, shown
by an S, M, L or V. These stand for Small, Medium, Large and Varies.

------------------------------------------------------------------------------

>> COMPARISONS <<
  -------------

*S Don't use Sgn(n), use n<0 and n>0 instead.
   Eg.
     Don't use:
       If Sgn(n)=-1 Then blah blah
       If Sgn(n)=1 Then blah de blah
     Use this instead:
       If n<0 Then blah blah
       If n>0 Then blah de blah

*M Use Abs(n)<v instead of n>-v and n<v.
   Eg.
     Don't use:
       If n>-5 and n<5 Then blah
     Use this instead:
       If Abs(n)<5 Then blah

*S Use < and > instead of <= and >= whenever possible. They are slightly
   faster. Strange, but true.
   Eg. When comparing immediate values
     Don't use:
       If n<=10 Then blah
     Use this instead:
       If n<11 Then blah
   But when using variables, don't use this kind of thing:
       A=10
       If B>A-1 Then etc
   instead of:
       A=10
       If B>=A Then etc

*M Don't do stuff like If n=True, If n=False, or If n<>0.
   Eg.
     Don't use:
       If a=True or b=False or c<>0 Then blah blah
     Use this instead:
       If a or Not(b) or c Then blah blah
   But ONLY do this if you are sure b is a boolean value. "Not" simply
   bitflips an integer, so Not -1 is 0, but Not 2 is $FFFFFFFD. AMOSPro is
   rather buggy with the Not() command, so you may have to experiment a bit.
   Also, using =False instead of =0 is faster.

   Another trick you should be aware of is that you can use the optimised
   version of <>0 where you would normally use >0 (or <0), if the routine
   only returns values >=0.
   Eg.
     Don't use:
       If Instr(a$,"Something")>0 Then blah
     Do this instead:
       If Instr(a$,"Something") Then blah

------------------------------------------------------------------------------

>> ARRAY ACCESS <<
  --------------

*M For your most used array of 8 (or fewer) elements, use Dreg() instead
   (it can still hold the same range of values).
   Note: The Dreg() array isn't actually using the data registers; it's an
   AMOS internal data structure. However, it's still faster than a standard
   array.
   Eg.
     Don't use:
       Dim n(7)
       For a=0 To 7:Print n(a):Next a
     Use this instead:
       For a=0 To 7:Print Dreg(a):Next a

*S Use single dimensional arrays instead of two dimensional ones if
   appropriate.
   Eg.
     Don't use:
       Dim A(2,100)
     Use this instead:
       Dim A0(100),A1(100),A2(100)

------------------------------------------------------------------------------

>> MATH OPERATIONS <<
  -----------------

*L Whenever possible use powers of 2 when multiplying or dividing. The
   AMOSPro Compiler will automatically optimise these to lsl and lsr.
   DON'T use the lsl or lsr commands in extensions, as they are over TWICE
   as slow!
   Eg.
     Don't use:
       n=a*30
     Use this instead (if suitable):
       n=a*32
   As well as this, you can also do:
       n=x*160 -> n=x*(128+32) -> n=x*128+x*32 (tadaa)
   Also n=x*192 -> n=x*128+x*64 etc.
   But this kind of thing is not faster, because the multiplier is held in
   a variable:
       a=32 : b=7 : c=a*b

*L Never use floating point numbers. Use integers multiplied by a power
   of 2. You can decide what sort of precision you want in your decimal
   places and then scale your values up to match. For example, if you
   decide on about two decimal places of accuracy, you can multiply all
   your values by 128 (2^7, close enough to 100) and then do the
   calculations. When you have the results, simply divide by 128 to get
   the required result.
   Eg.
     Don't use:
       x#=x#*1.5
       Plot x#,100
     Use this instead:
       x=x*(3*128) : Rem Same as x=x*(1.5*256)
       Rem 3*128 is calculated at compilation time
       Plot x/256,100

*L Predefine as much as possible. Especially useful for Sin and Cos etc.
   Eg.
     Don't use:
       Repeat
         If Sqr(x*x+y*y)=10 Then blah blah blah
       Until Something
     Use this instead:
       ' This code should be placed at the beginning of your code.
       xmax=Maximum value of x : ymax=Maximum value of y
       Dim QUICKSQR(xmax,ymax)
       For x=0 To xmax
         For y=0 To ymax
           QUICKSQR(x,y)=Sqr(x*x+y*y)
         Next y
       Next x
       ' [Other code]
       Repeat
         If QUICKSQR(Abs(x),Abs(y))=10 Then blah blah blah
       Until Something

*S Use N=N+A instead of using Inc, Dec or Add. Although the manual claims
   they are faster, when using the AMOSPro Compiler they actually turn out
   slightly slower.
   Eg.
     Don't use:
       Inc a : Dec b : Add c,10
     Use this instead:
       a=a+1 : b=b-1 : c=c+10
   Note: Although Add is slower in its short form, the full version with
   the base To top part is faster than the equivalent code. This only
   applies to standard variables - DON'T use this with arrays!

------------------------------------------------------------------------------

>> MISCELLANEOUS <<
  ---------------

*L Get yourself a fast extension, such as AMCAF or Turbo Plus. This will
   only speed things up where an extension command replaces an internal
   command with a faster one - Easylife and AMCAF will do this for many
   string operations, and AMCAF, Turbo and Powerbobs do it for graphics.
   Eg.
     Don't use:
       Plot x,y,c
     Use this instead:
       Turbo Plot x,y,c (for AMCAF)
     or
       F Plot x,y,c (for Turbo Plus)
   Many extensions can be found on Aminet (FTP from wuarchive.wustl.edu in
   systems/amiga/aminet/dev/amos).

*V Use assembler procedures for inner loops. Beware though, it does take a
   small amount of CPU time to jump to an assembler procedure, so with
   small routines it's simply not worth it.

*S Use "Copper Off" to disable the screen while doing intense
   communication. The copper then stops stealing clock cycles from the
   processor (unlike Multi No and such instructions, this really does
   work). Use "Copper On" to re-enable the normal system.

*L Use Fill to clear an array to a certain value, rather than the
   equivalent For / Next loop.
   Eg.
     Don't use:
       For n=0 To 1000
         A(n)=27
       Next n
     Use this instead:
       Fill Varptr(A(0)) To Varptr(A(1000)),27

*S Use global variables to return parameters from procedures, rather than
   Param.
   Eg.
     Don't use:
       _HELLO[10,20] : pram=Param : Print pram
       Procedure _HELLO[x,y]
         temp=0
         For n=x To y
           If a(n)=0 Then temp=n : Exit
         Next n
       End Proc[temp]
     Use this instead:
       Global pram
       _HELLO[10,20] : Print pram
       Procedure _HELLO[x,y]
         pram=0
         For n=x To y
           If a(n)=0 Then pram=n : Pop Proc
         Next n
       End Proc
   However, don't do things like this, which are actually slower.
     Don't do:
       _HELLO[10,20] : Print A
       Procedure _HELLO[X,Y]
         Shared A
         A=X*Y
       End Proc
     Do this:
       _HELLO[10,20] : Print Param
       Procedure _HELLO[X,Y]
       End Proc[X*Y]

*L Use subroutines instead of procedures. Although they're messier, they
   are several times faster! Of course, with this method you cannot pass
   parameters, but all the variables you were using will still be
   accessible.
   Eg.
     Don't use:
       _SOMETHING
       Procedure _SOMETHING
         Code
       End Proc
     Use this instead:
       Gosub _SOMETHING
       _SOMETHING:
         Code
       Return

------------------------------------------------------------------------------

>> TRADITIONAL OPTIMISATIONS <<
  ---------------------------

*V Move invariant statements outside loops (standard optimising compiler
   stuff).
   Eg.
     If you have something like this:
       Repeat
         If x>

------------------------------------------------------------------------------

>> GRAPHICS <<
  ----------

*M Use Cls col,x1,y1 To x2,y2 instead of Bar x1,y1 To x2,y2.
   Note: Use R Bar in Turbo instead if possible.

*S Use Polyline to draw boxes instead of Box.
   Note: Use R Box in Turbo instead if possible.
   Eg.
     Don't use:
       Box 10,10 To 20,20
     Use this instead:
       Polyline 10,10 To 20,10 To 20,20 To 10,20 To 10,10

------------------------------------------------------------------------------

>> STRINGS <<
  ---------

*L Use Peek(Varptr(a$)+x-1) instead of Asc(Mid$(a$,x,1)) and
   Poke Varptr(a$)+x-1,Asc(b$) instead of Mid$(a$,x,1)=b$.
   Eg.
     Don't use:
       A=Asc(Mid$(a$,3,1))
       Mid$(a$,6,1)=b$
     Use this instead:
       A=Peek(Varptr(a$)+2)
       Poke Varptr(a$)+5,Asc(b$) : Rem Asc(b$) could be predefined
   Note: This is just under three times as fast!

*M This routine makes a string 1 character shorter. It may be helpful in
   a program that replaces the Input command (if you ever need to optimise
   such a routine, which is unlikely).
   Eg.
     Don't use:
       S$=Space$(1000)
       For X=1 To 1000
         S$=Left$(S$,Len(S$)-1)
       Next X
     Use this instead:
       S$=Space$(1000)
       POS=Varptr(S$) : Rem Put this statement inside the loop if you
                        Rem do any standard string operations.
       For X=1 To 1000
         Doke POS-2,Deek(POS-2)-1
       Next X
   Note: This routine could be rather risky, as it is done outside the
   AMOS string handling system.

*S Don't use Chr$(code) in routines; replace it with a predefined string.
   Eg.
     Don't use:
       I=Instr(ST$,Chr$(0))
     Use this instead:
       C0$=Chr$(0) : Rem At the beginning of your prog
       I=Instr(ST$,C0$) : Rem In the routine

*V A quasi-optimisation: Use "n=Free" at points where you don't care about
   the speed, to reduce the chances of the AMOS string garbage collector
   being called when you do care about the speed. This will only have an
   effect when strings are used in the routine.

------------------------------------------------------------------------------

>> STYLE OPTIMISING <<
  ------------------

*S Use a Repeat / Until loop instead of a For / Next loop.
   Eg.
     Don't use:
       For n=0 To 10
         [code]
       Next n
     Use this instead:
       n=0
       Repeat
         [code]
         n=n+1
       Until n=11 : Rem Must be 1 more than with the original For

*S Use "If condition Then code" instead of "If condition : code : End If",
   if the code is short.
   Eg.
     Don't use:
       If a=2
         la-de-da
       End If
     Use this instead:
       If a=2 Then la-de-da

*S Don't use Start(BANK), Screen Width or Screen Height in a routine that
   must be fast. In fact, do the same with any AMOS value that won't
   change in the loop, including Joy(), Jup(), Fire(), etc. if you have
   several references to these in the same "loop".
   Eg.
     Don't use:
       For X=1 To 10000
         PE=Peek(Start(10)+X)
       Next X
     Use this instead:
       ST=Start(10)
       For X=1 To 10000
         PE=Peek(ST+X)
       Next X

*M Use the value of expressions, rather than testing what they mean.
   Eg.
     Don't use:
       If Jup(1) Then Y=Y-1
       If Jdown(1) Then Y=Y+1
     Use this instead:
       Y=Y+Jup(1)-Jdown(1)
   Remember: If it is True, -1 is returned, otherwise 0 is returned.
   The Turbo (Plus) extension has a useful command called "Texp" which
   returns specified values for True and False. Although it isn't very
   fast, it produces neater code.

*M Don't use parameters in Def Fns.
   As functions are local to procedures, any variables used will be known
   in the function.
   Eg.
     Don't use:
       Def Fn C(x,y)=(x*y*y+x) Mod 16
       For x=0 To 10
         For y=0 To 10
           Plot x,y,Fn C(x,y)
         Next y
       Next x
     Use this instead:
       Def Fn C=(x*y*y+x) Mod 16
       For x=0 To 10
         For y=0 To 10
           Plot x,y,Fn C
         Next y
       Next x

*S Use as few brackets in equations and If structures as possible.
   Eg.
     Don't use:
       A=(B*C)-(D*E)-(F+G)
       C=(A=2 or B=100)
     Use this instead:
       A=B*C-D*E-F-G
       C=A=2 or B=100

*M The fastest way to display a map, using a few of the above tips:

       ' This is the slow way, that I've seen in multiple sources of games
       For Y=0 To FHV
         For X=0 To FWV
           Paste Icon 16+X*16,16+Y*16,Peek((FPOSY+Y)*FW+Start(BLS1)+FPOSX)+1
         Next X
       Next Y

       ' Now my quick version... Note that the Start(bank)
       ' is placed in front of the routine.
       ' FHV = Field Height View
       ' FWV = Field Width View
       ' FPOSX = Field POSition X (Where the player is looking in the map)
       ' FPOSY = Field POSition Y
       POS=Start(BLS1)+FPOSX
       For Y=0 To FHV
         MPY=(FPOSY+Y)*FW+POS
         SPY=16+Y*16
         For X=0 To FWV
           ' Using a Turbo Plus extension command
           F Paste Icon 16+X*16,SPY,Peek(MPY+X)+1
         Next X
       Next Y

*V Try various ways of writing code to see if you can find the quickest.
   THIS is probably the SINGLE most important TIP for blazing speed. You
   can fine-tune a routine, even convert it to pure Assembler, but it may
   still be MUCH slower than another method you have not yet tried!! Find
   that faster method!!

   Something like this is a nice way to test your improvements:

       Timer=0
       For n=0 To 10000
         [unoptimised code]
       Next n
       Print "Time for unoptimised code:";Timer
       Timer=0
       For n=0 To 10000
         [optimised code]
       Next n
       Print "Time for optimised code:";Timer

   Note: This kind of test becomes quite useless if you have a CPU cache -
   the second loop may be reported as slower even if it's exactly the
   same. When doing such tests, you should disable the cache if possible.

   The key to gaining the absolute BEST speed is to ask yourself WHY is
   this routine so slow? WHAT is slowing it down?
   The excellent tips above will probably provide you with the answer more
   often than not, but the secret is you have to know what to look for.
   Think about what the computer does BEHIND THE SCENES. For example, why
   are two-dimensional arrays slower than single-dimensional arrays? Or,
   for that matter, why are even one-dimensional arrays slow?

   Think of it like this... When you access an element in a
   single-dimensional array the computer has to find the base address of
   the array. Then it must find the offset to reference the element you
   requested. If you had something like A=MyTable(10) the computer would
   process this along these lines:

     1. Find base address of array MyTable.
     2. Calculate offset of element 10 by multiplying 10 by 4 (4 is the
        size of an integer).
     3. Now add offset to array base to determine the element's memory
        address.
     4. Finally, extract the value held in this memory location.

   As you can see, that's quite a bit of work (particularly in a loop) just
   to do something that appears so simple on this end. Now, for a
   two-dimensional array, the computer has twice the work: two multiplies
   to determine the address of any given two-dimensional array element,
   such as A=MyTable(10,10).

   Like the Gosub replacing Procedure tip, this is a bit messier, but
   everything has a price, so try this:

       MyTableBase=Varptr(MyTable(0))

   Now instead of accessing the array through AMOS with A=MyTable(10),
   access it directly yourself like this: A=Leek(MyTableBase+40). The 40
   is the offset for element 10, multiplied by the integer size, 4.

   Used in a loop this accesses a one-dimensional array 20% faster than
   the standard AMOS code (ie A=Array(element)) and is 33% faster at
   accessing two-dimensional arrays. If you don't need long-word size
   variables, you can substitute Deek for Leek for a TINY bit more speed
   (allowing 25% faster access of one-dimensional arrays and 40% faster
   access of two-dimensional arrays). It should be better, but AMOS's
   Peek, Deek and Leek are kind of slow themselves!!
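
   The four steps above can be put side by side with the direct version
   as a rough sketch. It has not been timed or tested here. The array
   size, the test values and the names I, A, B and C are made up for
   illustration, and it assumes a one-dimensional integer array whose
   elements sit 4 bytes apart, as described above.

```amos
Dim MYTABLE(100)
For I=0 To 100 : MYTABLE(I)=I*3 : Next I
' Fetch the address of element 0 once, outside any loop
MYTABLEBASE=Varptr(MYTABLE(0))
' Standard access: AMOS works out base+10*4 for you
A=MYTABLE(10)
' Direct access: element 10 is assumed to live 40 bytes after the base
B=Leek(MYTABLEBASE+40)
' A and B should match if the layout assumption holds
Print A,B
' The same idea with a variable index (offset = index*4)
For I=0 To 100
  C=Leek(MYTABLEBASE+I*4)
Next I
```

   If the two printed values differ on your setup, the layout assumption
   does not hold there and the direct method should not be used. Wrap each
   version in the Timer test shown earlier to see what you actually gain.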
------------------------------------------------------------------------------

If you find any general purpose tips, please send them to me (at
bwyatt@paston.co.uk) and I'll add them to the list!

If you want proof that these tips do work, then take a look at two of my
recent games, Borisball and Knockout 2. Apart from being tremendously good
fun, they demonstrate how the AMOS speed limit can be broken!! Their Aminet
positions are:

    game/demo/BorisBall.lha
    game/2play/KnockOut2.lha